Recent studies have shown that machine learning algorithms can forecast future 
blood glucose (BG) levels for diabetes patients. In this study, a dataset from a 
continuous glucose monitoring (CGM) system was used as the sole input for the 
machine learning models. To forecast blood glucose levels 15, 30, and 45 minutes 
in the future, we propose a deep neural network (DNN) and test it on 7 patients 
with type 1 diabetes (T1D). The proposed prediction model was evaluated against 
a variety of machine learning models, such as k-nearest neighbor (KNN), support 
vector regression (SVR), decision tree (DT), adaptive boosting (AdaBoost), 
random forest (RF), and eXtreme gradient boosting (XGBoost). The experimental 
findings demonstrated that the proposed DNN model outperformed all other models, 
with average root mean square errors (RMSEs) of 17.295, 25.940, and 
35.146 mg/dL over prediction horizons (PHs) of 15, 30, and 45 minutes, 
respectively. Additionally, we integrated the proposed prediction model into a 
web-based blood glucose level prediction tool. With this web-based system, 
patients can readily obtain their future blood glucose levels, which allows 
preventative alarms to be generated before critical hypoglycemic or 
hyperglycemic events. 
1. INTRODUCTION 

Type 1 diabetes (T1D) is a chronic metabolic condition in which blood glucose (BG) levels are elevated 
because of an inability to secrete insulin [1]. To achieve near-normal glucose metabolism, T1D patients must 
administer insulin through an insulin pump or injection [2]. Patients with diabetes should regularly check 
their blood glucose levels to prevent hypoglycemia and hyperglycemia [3]. Blood glucose levels above 180 
mg/dL are considered hyperglycemia (high blood sugar), whereas blood glucose levels below normal, or 
BG < 70 mg/dL, are considered hypoglycemia (low blood sugar). Recent studies have demonstrated that 
hypoglycemia is linked to increased short-term and long-term mortality [4], whereas hyperglycemia can lead to 
long-term complications such as retinopathy [5], renal disease [6], and cardiovascular disease [7]. 

The fourth industrial revolution, often known as Industry 4.0, is being driven by internet of things 
(IoT) technology. By using IoT to link and communicate with physical items via the internet, industrial 
processes can be automated and made more efficient, and decision-making can be improved. IoT-enabled 
sensors, for instance, may be used in a variety of systems, including network monitoring systems [8], [9], 
healthcare monitoring systems [10], [11], systems for monitoring automotive manufacturing [12], [13], and 
systems for monitoring the ambient conditions of rooms [14], [15]. Continuous glucose monitoring (CGM) 
systems are a sophisticated sensor technology that can be used to assess blood glucose levels. This wearable 
medical device uses an IoT-based approach and provides real-time measurements of subcutaneous 
glucose concentration every 1 to 5 minutes for several consecutive days [16]-[21]. CGM devices must be 
calibrated against reference blood glucose levels obtained from a portable self-monitoring of blood glucose 
(SMBG) device. Additionally, the sensor data generated by a CGM device can be used as the 
input feature for prediction models, so that future blood glucose levels can be obtained. 

Recent studies have demonstrated that machine learning models using sensor data from a CGM device 
as the only input can estimate future blood glucose levels and improve the treatment of diabetes [22]-[27]. 
Pérez-Gandia et al. [28] used the last 20 minutes of blood glucose values as the input to predict the future 
blood glucose levels of 15 T1D patients. Ben et al. [29] demonstrated an artificial neural network 
(ANN)-based prediction model for forecasting blood glucose levels; to enhance prediction performance, the 
ideal number of features for each patient model was examined. Hamdi et al. [30] used the differential 
evolution (DE) approach to optimize the prediction model's parameters; their support vector regression 
(SVR)-based prediction model performed better than previous machine learning-based models while using a 
limited set of parameters. A prediction model based on long short-term memory (LSTM) has also been 
proposed for forecasting blood glucose levels [31]; it produced predictions for the next 30 and 60 minutes 
using the past 60 minutes of CGM data as the input feature. To improve the predictability of blood glucose 
levels, Alfian et al. [32] combined time-domain information with their ANN model. The analysis of 12 T1D 
patients' blood glucose levels demonstrated that their model performed better than existing models. 

Previous studies have shown that a machine learning model can be incorporated into a web-based system. 
Ahmed et al. [33] incorporated a trained support vector machine (SVM) model into a web application 
to predict diabetes accurately and efficiently. Snodgrass and Milkov [34] created a web-based machine 
learning program to identify natural gas sources. However, none of these earlier studies incorporated a deep 
neural network (DNN)-based prediction model into a web-based blood glucose level prediction system. 
Therefore, in the current study a DNN was incorporated into a web-based blood glucose level prediction 
system to improve the performance of BG prediction. Sensor data from a CGM was used as the input feature 
to train the DNN model, and the trained model can then be used to forecast blood glucose levels 15, 30, and 
45 minutes in the future. A number of machine learning models, including k-nearest neighbor (KNN), 
decision tree (DT), adaptive boosting (AdaBoost), random forest (RF), eXtreme gradient boosting 
(XGBoost), and support vector regression (SVR), have been shown to be effective at forecasting future 
blood glucose levels in previous studies [30], [32]. Our approach is therefore evaluated against these 
machine learning-based prediction models. Additionally, incorporating the proposed DNN model into a 
web-based blood glucose level prediction system could make it easier for diabetic patients to obtain future 
blood glucose values, allowing them to take preventative measures before critical conditions (hypoglycemia 
or hyperglycemia) arise. 


2. METHOD 

In this study, sensor data from continuous glucose monitoring (CGM) devices was collected and 
used to train machine learning models. Once training was complete, the models could predict the future 
blood glucose levels of a diabetic patient. The patients can be informed of the prediction outcome, 
which is expected to help them manage their disease. Figure 1 shows in detail how a forecasting model is 
created from each patient's own blood glucose data. Data pre-processing is performed first, which entails 
dealing with incomplete, inconsistent, and inappropriate data. The time-series dataset is also divided into 
training and test sets at this stage. In the feature extraction step that follows, the time-series sensor data is 
transformed into an input matrix (X) and an output vector (Y). The proposed prediction model then uses this 
set of paired inputs and intended outputs to learn the underlying pattern. During the learning process, the 
model is optimized on the training set by minimizing the prediction error. Once this process is complete, 
predictions are generated by applying the trained model to the test set. To evaluate model performance, the 
predictions are compared with the real values (ground truth) from the test set. Finally, the trained model is 
integrated into a web-based system so that the prediction output can be presented to the diabetic patient 
through a web browser. 
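
The text does not say how incomplete readings were handled during pre-processing. As one possible illustration only, the sketch below fills missing CGM samples by linear interpolation between the nearest known neighbors. The function name `fill_gaps`, the use of `None` for a missing sample, and the choice of linear interpolation are all our assumptions, not the method used in this study.

```python
def fill_gaps(values):
    """Fill None entries in an evenly sampled series (hypothetical helper).

    Interior gaps are linearly interpolated between the nearest known
    neighbors; gaps at either edge copy the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series contains no known values")

    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start of the series
            filled[i] = values[right]
        elif right is None:         # gap at the end of the series
            filled[i] = values[left]
        else:                       # interior gap: straight line between neighbors
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled
```

Long gaps are probably better excluded than interpolated, since a straight line across a missed hour would invent a glucose trend that was never measured.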

Figure 2 presents the sensor data generated by a CGM device for a single diabetic patient. A blood 
glucose value was generated every 5 minutes, and this time-series sensor data was then stored in a 
database for further analysis. In this step, the sensor data is presented together with the upper and lower 
limits, so that the patient can easily see whether their blood glucose level falls in the normal range, above 
the upper limit (hyperglycemia), or below the lower limit (hypoglycemia). For diabetic patients, maintaining 
blood glucose levels in the normal range is crucial; a machine learning model can therefore be used to 
generate future blood glucose values. By obtaining predicted blood glucose levels, the diabetic patient could 
avoid critical conditions such as hypoglycemia and hyperglycemia. 
Figure 1. General framework for developing and assessing prediction models 
Figure 2. History of CGM data from a diabetic patient 
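
To make the upper and lower limits concrete, the following minimal sketch labels a glucose value using the thresholds given in the Introduction (below 70 mg/dL for hypoglycemia, above 180 mg/dL for hyperglycemia) and flags any forecast that falls outside the normal range. The names `classify_bg` and `out_of_range_alerts` are hypothetical and are not taken from the system described in this paper.

```python
# Thresholds stated in the Introduction, in mg/dL.
HYPO_THRESHOLD = 70.0    # values below this count as hypoglycemia
HYPER_THRESHOLD = 180.0  # values above this count as hyperglycemia


def classify_bg(value_mg_dl):
    """Label one glucose value relative to the lower and upper limits."""
    if value_mg_dl < HYPO_THRESHOLD:
        return "hypoglycemia"
    if value_mg_dl > HYPER_THRESHOLD:
        return "hyperglycemia"
    return "normal"


def out_of_range_alerts(predictions):
    """Keep only the horizons (in minutes) whose forecast is out of range.

    `predictions` maps a prediction horizon in minutes to a forecast in mg/dL.
    """
    alerts = {}
    for horizon, value in predictions.items():
        label = classify_bg(value)
        if label != "normal":
            alerts[horizon] = label
    return alerts
```

For example, a forecast of 68 mg/dL at the 30-minute horizon would be flagged as hypoglycemia, which is the kind of early warning the prediction system is meant to support.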


This study uses the direct technique, which creates a separate model for each prediction time step 
[32], [35], [36]. The direct strategy learns H models $f_h$ independently, 

$$y_{t+h} = f_h(y_t, \ldots, y_{t-n+1}) + w \qquad (1)$$

with $t \in \{n, \ldots, N-H\}$ and $h \in \{1, \ldots, H\}$, and returns a multi-step forecast by concatenating the H 
predictions. The time-series data was transformed into a set of paired inputs (X) and intended outputs (Y) to 
enable machine learning models to learn from the training data. Let G denote a patient's time-series sensor 
data and $g_i$ a specific sensor value in the set, where $g_i \in G$ and $i = 1, 2, \ldots, N$; N is the total number 
of sensor data entries. Finally, given the n previous values (the window size), the forecasting horizon h, and 
the list of sensor data G, the input X can be derived by creating an $[(N-n-h+1) \times n]$ input data matrix, 

$$X = \begin{bmatrix} g_1 & g_2 & \cdots & g_n \\ g_2 & g_3 & \cdots & g_{n+1} \\ \vdots & \vdots & \ddots & \vdots \\ g_{N-n-h+1} & g_{N-n-h+2} & \cdots & g_{N-h} \end{bmatrix} \qquad (2)$$

and the output Y is the corresponding $[(N-n-h+1) \times 1]$ vector, 

$$Y = \begin{bmatrix} g_{n+h} \\ g_{n+h+1} \\ \vdots \\ g_N \end{bmatrix} \qquad (3)$$
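
The transformation of the series G into the paired input matrix X and output vector Y can be sketched as a sliding window. This is a minimal illustration, not the authors' code: the function name `make_supervised` is hypothetical, and the horizon h is expressed in samples. With one reading every 5 minutes, horizons of 15, 30, and 45 minutes would correspond to h = 3, 6, and 9, which is our reading of the setup rather than something stated explicitly.

```python
def make_supervised(series, n, h):
    """Build (X, Y) for window size n and forecasting horizon h (in samples).

    Each row of X holds n consecutive readings; the matching entry of Y is
    the reading h steps after the last value of that row. The result has
    N - n - h + 1 rows, where N = len(series).
    """
    N = len(series)
    rows = N - n - h + 1
    if rows <= 0:
        raise ValueError("series too short for this window size and horizon")

    X = [list(series[i:i + n]) for i in range(rows)]
    Y = [series[i + n - 1 + h] for i in range(rows)]
    return X, Y


def make_direct_datasets(series, n, horizons):
    """Direct strategy: one (X, Y) pair, and hence one model, per horizon."""
    return {h: make_supervised(series, n, h) for h in horizons}
```

Because each horizon gets its own (X, Y) pair, a separate model can be fitted per horizon, and the multi-step forecast is obtained by collecting the individual predictions.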


We propose a deep neural network (DNN) with one input layer, two or more hidden layers, and one 
output layer to predict the blood glucose level of diabetes patients. The prediction model is trained using the 
backpropagation technique [37], [38]. As shown in Figure 3, the model is fully connected: each unit receives 
connections from every unit in the preceding layer, and there is a weight for each pair of units in two 
successive layers. The net input of a unit is determined by multiplying each input by its associated weight 
and adding the products together. In every unit of the hidden layers, the activation function is applied to the 
net input. The backpropagation method updates the weights after each training iteration so as to reduce the 
mean squared error between predicted and target values. This process is repeated to obtain the final model, 
which is then applied to the test set. 


Figure 3. Proposed DNN model to predict future blood glucose levels (layers shown: input layer, hidden layers, output layer) 
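
The forward computation of a fully connected network, as described above, can be sketched in a few lines. This is an illustration under stated assumptions: the text does not name the activation function or mention bias terms, so the ReLU activation, the bias vectors, and the function names `dense` and `forward` are our own choices. Training by backpropagation is not shown.

```python
def relu(z):
    """Rectified linear unit, assumed here as the hidden-layer activation."""
    return max(0.0, z)


def dense(inputs, weights, biases, activation=None):
    """One fully connected layer.

    `weights` holds one list of input weights per unit. The net input of a
    unit is the weighted sum of all inputs plus its bias.
    """
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        net = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(net) if activation else net)
    return outputs


def forward(x, layers):
    """Run one input vector through the network and return the scalar output.

    `layers` is a list of (weights, biases) pairs. Hidden layers use ReLU;
    the final layer is linear because the target is a glucose value.
    """
    activations = list(x)
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases,
                            None if index == last else relu)
    return activations[0]
```

In the setting of this paper the input vector would hold the most recent CGM readings and the single output would be the predicted glucose value for one horizon.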


The clinical dataset that we used, the CGM dataset from DirecNet, is freely accessible online 
at [39]. We used CGM sensor data from 7 individuals with type 1 diabetes (T1D) to assess how well the 
machine learning models perform. The Guardian-RT device produced the time-series data every five minutes 
for around seven days. The time-series dataset was divided into training and testing sections, with the 
first 80% of the data used for training. Min-max scaling was used to normalize the data's features. We used 
several Python libraries, including XGBoost, SciPy, and Scikit-learn, to create the machine learning 
models [40]. The metrics we use to assess the machine learning models are the coefficient of determination 
(R²), root-mean-square error (RMSE), and mean absolute percentage error (MAPE). RMSE represents the 
difference between the predicted and actual values, whereas MAPE expresses the error as a percentage. 
Finally, the coefficient of determination is computed as the square of the correlation (R) between predicted 
and observed values, which means that it varies from 0 (no correlation) to 1 (complete correlation). 
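
The chronological 80/20 split and the min-max scaling can be illustrated as follows. This is a minimal sketch, not the code used in the study: the helper names are hypothetical, and fitting the scaler on the training portion only is our assumption, since the text does not say which portion of the data the scaling parameters were derived from.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series in order: the first part trains, the rest tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


class MinMaxScaler1D:
    """Scale values to [0, 1] using the minimum and maximum seen in fit()."""

    def fit(self, values):
        self.lo = min(values)
        self.hi = max(values)
        return self

    def transform(self, values):
        span = self.hi - self.lo
        if span == 0:
            return [0.0 for _ in values]
        return [(v - self.lo) / span for v in values]

    def inverse_transform(self, scaled):
        """Map scaled values back to the original units (mg/dL)."""
        span = self.hi - self.lo
        return [s * span + self.lo for s in scaled]
```

Splitting in time order, rather than shuffling, keeps every test reading later than every training reading. With a scaler fitted on the training portion, test values outside the training range map outside [0, 1].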


3. RESULTS AND DISCUSSION 

This section examines the performance of the proposed DNN model in predicting future blood glucose 
levels. We compare the performance of the proposed model with that of other machine learning models, 
using the dataset from T1D patients. We also demonstrate a possible implementation of a web-based 
prediction system that integrates the proposed DNN model. 


3.1. Performance evaluation 

The proposed DNN model successfully predicted blood glucose levels for the CGM dataset. The 
proposed blood glucose forecasting model was compared with KNN, SVR, DT, AdaBoost, RF, and 
XGBoost. The performance of the machine learning models was examined using the CGM sensor data from 7 
individuals with type 1 diabetes. Table 1 shows the mean and standard deviation of RMSE, MAPE, and R² 
for the machine learning models at different prediction horizons (PHs). 
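
For reference, the three evaluation metrics can be computed as in the sketch below. It follows the definitions given in the Method section, in particular R² as the square of the Pearson correlation between predicted and observed values. The function names are our own, and this is not the code behind Table 1.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error, in the units of the data (mg/dL here)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def r_squared(actual, predicted):
    """Square of the Pearson correlation between actual and predicted."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        return 0.0  # correlation is undefined for a constant series
    return cov ** 2 / (var_a * var_p)
```

Defined this way, R² measures how closely predictions track the observed values linearly, so it should be read together with RMSE and MAPE, which capture the size of the error.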
According to earlier studies [32], [41], the most recent 30 minutes of BG data are substantially 
correlated with values in the near future. The CGM data from the previous 30 minutes therefore served as the 
feature input for the machine learning-based forecasting models; that is, the models predicted BG levels 
using a window size of n=6 (the last 30 minutes) of CGM data. An individual forecasting model was created 
for each patient, and the average performance across the seven patient models was calculated. Compared 
with the other machine learning models, the proposed DNN model performed best, producing the lowest 
RMSE and MAPE. For prediction horizons (PHs) of 15, 30, and 45 minutes, the proposed DNN's RMSE 
values are 17.295, 25.940, and 35.146 mg/dL, respectively. The proposed DNN model also produced the 
highest R² (coefficient of determination): 0.925, 0.825, and 0.682 for PHs of 15, 30, and 45 minutes, 
respectively. Finally, all forecasting models produced higher RMSE and MAPE but lower R² as the 
prediction horizon grew. 


Table 1. Comparison of the effectiveness of forecasting models (mean ± standard deviation across the 7 patient models) 

Model            Metric         Prediction horizon (minutes)
                                15                 30                 45
SVR              RMSE (mg/dL)   47.224 ± 12.665    52.054 ± 13.741    55.963 ± 14.035
                 MAPE (%)       22.268 ± 11.625    26.424 ± 12.836    29.568 ± 13.201
                 R²             0.438 ± 0.158      0.314 ± 0.187      0.205 ± 0.199
KNN              RMSE (mg/dL)   26.376 ± 7.766     38.210 ± 12.486    49.010 ± 12.920
                 MAPE (%)       12.495 ± 4.381     19.703 ± 8.671     26.468 ± 11.417
                 R²             0.820 ± 0.071      0.614 ± 0.181      0.372 ± 0.210
DT               RMSE (mg/dL)   32.926 ± 13.163    42.149 ± 12.191    56.216 ± 13.486
                 MAPE (%)       15.294 ± 6.793     21.026 ± 6.908     30.800 ± 10.494
                 R²             0.716 ± 0.178      0.537 ± 0.178      0.175 ± 0.250
RF               RMSE (mg/dL)   28.708 ± 12.648    38.173 ± 11.897    49.394 ± 11.872
                 MAPE (%)       12.712 ± 6.521     19.624 ± 8.557     26.765 ± 11.074
                 R²             0.778 ± 0.159      0.609 ± 0.179      0.354 ± 0.209
AdaBoost         RMSE (mg/dL)   28.533 ± 11.471    37.275 ± 10.665    46.423 ± 9.654
                 MAPE (%)       13.342 ± 6.038     19.466 ± 7.399     25.684 ± 10.132
                 R²             0.786 ± 0.132      0.633 ± 0.139      0.431 ± 0.158
XGBoost          RMSE (mg/dL)   30.301 ± 12.727    40.743 ± 13.197    52.712 ± 14.251
                 MAPE (%)       14.049 ± 6.474     21.180 ± 9.624     28.782 ± 12.506
                 R²             0.759 ± 0.160      0.559 ± 0.200      0.268 ± 0.279
Proposed model   RMSE (mg/dL)   17.295 ± 4.233     25.940 ± 6.030     35.146 ± 8.457
                 MAPE (%)       7.478 ± 1.828      12.356 ± 3.479     17.681 ± 5.276
                 R²             0.925 ± 0.016      0.825 ± 0.047      0.682 ± 0.084


3.2. Practical implication 

The prediction of chronic illness [33], [42], [43], self-care [44], and preventive medicine [45] are 
just a few areas where previous studies have shown how machine learning models might be integrated into 
web-based systems to help with decision-making. Therefore, the goal of our study is to develop and deploy a 
web-based system for forecasting blood glucose levels, which will help the patient obtain future blood 
glucose readings. The medical team may also utilize this web-based system, which will aid them in selecting 
candidates for screening. 

The proposed forecasting model was implemented on the server side using the Scikit-learn package, 
and the web-based blood glucose level prediction system was written in PHP and Python with the Flask web 
framework. As shown in Figure 4, the patient and the medical staff can access the application from a mobile 
device or from a web browser on a computer. The sensor data from the CGM is first sent to the server side 
and saved in a MySQL database. Then, to forecast future blood glucose levels, the latest 30 minutes of CGM 
sensor data are passed as the input feature to an application programming interface (API) built on the Flask 
web framework, where the trained prediction model is applied. Finally, a web page shows the diabetic 
patient the predicted blood glucose values. Figure 5 shows the CGM sensor data history as displayed by the 
web-based system. This view enables diabetic patients to track their blood glucose levels continuously and 
to determine whether they are within the normal range. Figure 6 shows the proposed DNN's prediction 
results as displayed on a web page. The proposed DNN prediction model uses the previous 30 minutes of 
CGM sensor data to forecast future blood glucose levels up to the following hour. 
Figure 6. The web-based system presents the blood glucose values for the future 
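
The server-side prediction step can be sketched as a framework-independent handler that takes a JSON request body, selects the latest 30 minutes of readings (6 samples at 5-minute intervals), and returns one forecast per horizon. The paper does not show its API, so everything here is hypothetical: the function name, the `readings` and `forecast_mg_dl` field names, and the `models` mapping from horizon to a callable. In a Flask application, a function like this would typically be called from a route handler.

```python
import json

WINDOW_SIZE = 6               # last 30 minutes at one reading per 5 minutes
HORIZONS_MIN = (15, 30, 45)   # prediction horizons evaluated in this paper


def handle_prediction_request(body, models):
    """Return (HTTP status code, JSON response body) for one request.

    `body` is a JSON string such as '{"readings": [110, 112, ...]}'.
    `models` maps each horizon in minutes to a callable that takes the list
    of the latest WINDOW_SIZE readings and returns a forecast in mg/dL.
    """
    try:
        readings = json.loads(body)["readings"]
    except (ValueError, KeyError, TypeError):
        return 400, json.dumps({"error": "expected JSON with a 'readings' list"})

    if not isinstance(readings, list) or len(readings) < WINDOW_SIZE:
        return 400, json.dumps({"error": "need at least 6 readings"})

    try:
        window = [float(v) for v in readings[-WINDOW_SIZE:]]
    except (TypeError, ValueError):
        return 400, json.dumps({"error": "readings must be numeric"})

    forecast = {str(h): round(float(models[h](window)), 1) for h in HORIZONS_MIN}
    return 200, json.dumps({"forecast_mg_dl": forecast})
```

Keeping the handler separate from the web framework makes the validation and prediction logic easy to test without starting a server.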
4. CONCLUSION 

In this work, a DNN model was proposed for forecasting blood glucose levels several steps into the 
future. Time-series data was transformed into a set of paired inputs and desired outputs using the direct 
technique, which allows traditional machine learning models to be trained on the resulting training set to 
produce predictions. Input features for the proposed DNN were taken from the BG values of the previous 30 
minutes. The proposed DNN model demonstrated the best performance, with the lowest RMSE and MAPE 
and the highest R² among the forecasting models compared. For prediction horizons (PHs) of 15, 30, and 
45 minutes, the proposed DNN's RMSE values are 17.295, 25.940, and 35.146 mg/dL, respectively. To 
enhance the usefulness of the proposed system, we also integrated the DNN model into a web-based 
blood glucose level prediction system. The proposed DNN model might help diabetes patients obtain 
future blood glucose values through their web browser, allowing them to take preventative measures 
before severe conditions (hypoglycemia or hyperglycemia) arise. Because the dataset used in this study 
consisted of only a small number of T1D patients (7 children), it is difficult to draw conclusions about the 
overall effectiveness of the forecasting models. In future work, the dataset size might be increased and a 
comparison with other forecasting models might be carried out. The study used a dataset from a CGM 
device, which requires calibration to convert its electrical signal into glucose concentration in real time; the 
calibration process requires one or more reference BG values collected from a portable SMBG device. In 
addition, CGM data may have missing values and outliers, so a preprocessing step is crucial to maximize 
model prediction accuracy. 